Crosslingual Induction of Semantic Roles

Authors

  • Ivan Titov
  • Alexandre Klementiev
Abstract

We argue that multilingual parallel data provides a valuable source of indirect supervision for the induction of shallow semantic representations. Specifically, we consider unsupervised induction of semantic roles from sentences annotated with automatically predicted syntactic dependency representations, using a state-of-the-art generative Bayesian non-parametric model. At inference time, instead of seeking only the model that best explains the monolingual data available for each language, we regularize the objective by introducing a soft constraint that penalizes disagreement in argument labeling on aligned sentences. We propose a simple approximate learning algorithm for our setup which results in efficient inference. When applied to German-English parallel data, our method obtains a substantial improvement over a model trained without the agreement signal, when both are tested on non-parallel sentences.
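
The regularized objective described in the abstract can be illustrated with a minimal sketch. This is not the authors' model or learning algorithm: the function names, the `penalty_weight` parameter, and the way disagreement is counted (matching each induced role in one language to the role it most often aligns with in the other) are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def disagreement_count(aligned_pairs):
    """Count aligned argument pairs whose role labels disagree.

    aligned_pairs: list of (role_in_lang_a, role_in_lang_b) tuples, one per
    pair of word-aligned arguments. Induced role labels are not shared across
    languages, so (as an assumption of this sketch) each language-A role is
    matched to the language-B role it co-occurs with most often; every other
    pairing counts as a disagreement.
    """
    cooccurrence = defaultdict(Counter)
    for role_a, role_b in aligned_pairs:
        cooccurrence[role_a][role_b] += 1
    # Pairs that follow the best match for their language-A role "agree".
    agreeing = sum(c.most_common(1)[0][1] for c in cooccurrence.values())
    return len(aligned_pairs) - agreeing


def regularized_objective(loglik_a, loglik_b, aligned_pairs, penalty_weight):
    """Monolingual fit for both languages minus a soft disagreement penalty.

    loglik_a, loglik_b: hypothetical log-likelihoods of each monolingual model.
    penalty_weight: hypothetical strength of the soft agreement constraint.
    """
    return loglik_a + loglik_b - penalty_weight * disagreement_count(aligned_pairs)


# Toy example: four aligned arguments, one of which breaks the majority match.
pairs = [("A0", "X"), ("A0", "X"), ("A0", "Y"), ("A1", "Y")]
score = regularized_objective(-10.0, -12.0, pairs, penalty_weight=0.5)
```

With `penalty_weight` set to zero the score reduces to the sum of the two monolingual terms, which corresponds to training without the agreement signal.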

Similar resources

Driving inversion transduction grammar induction with semantic evaluation

We describe a new technique for improving statistical machine translation training by adopting scores from a recent crosslingual semantic frame based evaluation metric, XMEANT, as outside probabilities in expectation-maximization based ITG (inversion transduction grammars) alignment. Our new approach strongly biases early-stage SMT learning towards semantically valid alignments. Unlike previous...


A semantically confidence-weighted ITG induction algorithm

We propose a new algorithm to induce inversion transduction grammars, in which a crosslingual semantic frame based objective function is injected as confidence weighting in the early stages of statistical machine translation training. Unlike recent work on improving translation adequacy that uses a monolingual semantic frame based objective function to drive the tuning of loglinear mixture weig...


Cross-Lingual Syntactically Informed Distributed Word Representations

We develop a novel cross-lingual word representation model which injects syntactic information through dependency-based contexts into a shared cross-lingual word vector space. The model, termed CLDEPEMB, is based on the following assumptions: (1) dependency relations are largely language-independent, at least for related languages and prominent dependency links such as direct objects, as evidenc...


Multilingual Training of Crosslingual Word Embeddings

Crosslingual word embeddings represent lexical items from different languages using the same vector space, enabling crosslingual transfer. Most prior work constructs embeddings for a pair of languages, with English on one side. We investigate methods for building high quality crosslingual word embeddings for many languages in a unified vector space. In this way, we can exploit and combine infor...


ITRI-03-13 CROCODIAL: Crosslingual Computer-mediated Dialogue

We describe a novel approach to crosslingual dialogue which allows for highly accurate communication of semantically complex content. The approach is introduced through an application in a B2B scenario. We are currently building a browser-based prototype for this scenario. The core technology underlying the approach is natural language generation. We also discuss how the proposed approach can c...



Publication date: 2012